Abstract

The file caching problem is defined as follows. Given a cache of size k (a positive integer), 
the goal is to minimize the total retrieval cost for the given sequence of requests to files. A file 
$f$ has size $\mathrm{size}(f)$ (a positive integer) and retrieval cost $\mathrm{cost}(f)$ (a non-negative number) for
bringing the file into the cache. A miss or fault occurs when the requested file is not in the 
cache and the file has to be retrieved into the cache by paying the retrieval cost, and some other 
file may have to be removed (evicted) from the cache so that the total size of the files in the 
cache does not exceed k. 

We study the following variants of the online file caching problem. Caching with Rental
Cost (or Rental Caching): There is a rental cost $\lambda$ (a positive number) for each file in the
cache at each time unit. The goal is to minimize the sum of the retrieval costs and the rental
costs. Caching with Zapping: A file can be zapped by paying a zapping cost $N \ge 1$. Once
a file is zapped, all future requests to the file incur no cost. The goal is to minimize the
sum of the retrieval costs and the zapping costs.

We study these two variants and also the variant which combines these two (rental caching 
with zapping). We present deterministic lower and upper bounds in the competitive-analysis 
framework. We study and extend the online covering algorithm from [19] to give deterministic
online algorithms. We also present randomized lower and upper bounds for some of these 
problems. 

1 Introduction 

1.1 Background 

The file caching (or generalized caching) problem is defined as follows. Given a cache of size k (a 
positive integer), the goal is to minimize the total retrieval cost for the given sequence of requests 
to files. A file $f$ has size $\mathrm{size}(f)$ (a positive integer) and retrieval cost $\mathrm{cost}(f)$ (a non-negative
number) for bringing the file into the cache. A miss or fault occurs when the requested file is not 
in the cache and the file has to be brought into the cache by paying the retrieval cost. When a file 
is retrieved into the cache, some other file may have to be removed (evicted) from the cache so that 
the total size of the files in the cache does not exceed k. Weighted caching (or weighted paging) is 
the special case when each file has size 1. Paging is the special case when each file has size 1 and 
the retrieval cost for each file is 1. 

An algorithm is online if its response for each request is independent of all future requests. 
Let $ALG(\sigma)$ be the cost of an algorithm ALG on request sequence $\sigma$, and let $OPT(\sigma)$ be the
corresponding optimal offline cost. ALG is $\alpha$-competitive if, for every request sequence $\sigma$,
$ALG(\sigma) \le \alpha \cdot OPT(\sigma) + c$, where $c$ is a constant independent of the request sequence.

In this paper, we study the following variants of the file caching problem in the online setting 
using the competitive-analysis framework [17].

Definition 1. Caching with Rental Cost (or Rental Caching): There is a rental cost $\lambda$ (a
positive number) for each file in the cache at each time step. The goal is to minimize the sum of
the retrieval costs and the rental costs. In our model, we allow time steps with no requests.

Chrobak [rj proposes the rental caching problem and also presents some preliminary results. 
Weighted rental caching (or, weighted rental paging) is a special case of the rental caching problem 
where each file has size 1. Rental paging is a special case where each file has size 1 and the retrieval 
cost for each file is 1. 

The rental caching problem is motivated by the idea of energy efficient caching. Caching systems 
can save power by turning off the memory blocks that are not being used to store any files. Rental
Caching models this by charging a rental cost for keeping each file in the cache. See [la] for specific
applications. 

In Section 3.2 we show that the variant of rental caching where the cache has infinite size is
closely related to the ski-rental problem. The ski-rental problem is the following. A pair of skis
can be rented by paying $A per day, or can be bought for the remainder of the ski season by paying
$B. It is not known when the season is going to end, and the goal is to minimize the total money
spent for the entire season.
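The classical deterministic strategy for ski-rental can be sketched as follows. This is a minimal illustration of the break-even rule (keep renting until one more day of rent would exceed the purchase price, then buy), not code from this paper. The function names `break_even_cost` and `offline_cost` and the parameters `rent`, `buy`, and `season_days` are assumptions made for the example.

```python
def break_even_cost(rent, buy, season_days):
    """Cost of the break-even strategy over a season of `season_days` days."""
    paid = 0
    for _ in range(season_days):
        if paid + rent > buy:
            # One more day of rent would exceed the purchase price: buy now.
            return paid + buy
        paid += rent
    return paid


def offline_cost(rent, buy, season_days):
    """Optimal cost when the season length is known in advance."""
    return min(rent * season_days, buy)
```

With rent 1 and purchase price 10, an 11-day season costs the strategy 20 while the offline optimum is 10; for seasons of up to 10 days the two costs coincide.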

Definition 2. Caching with Zapping: There is an additional cache of infinite size and any file
can be added to this cache by paying a cost N (a positive number greater than or equal to 1) at 
any time step. When a file is placed into this additional cache, we say the file has been zapped. A 
miss or fault occurs only when the requested file is not present in either cache. Thus, any future 
requests to a file in the additional cache do not incur any cost. The goal is to minimize the sum of 
the retrieval costs and the zapping costs. 

Weighted caching with zapping (or, weighted paging with zapping) is a special case of the caching
with zapping problem where each file has size 1. Paging with zapping is a special case where
each file has size 1 and the retrieval cost for each file is 1.

These variants generalize the file caching problem. File caching is a special case of rental caching 
where the rental cost is 0. Similarly, caching is a special case of caching with zapping where the 
cost of zapping is arbitrarily large. We also study the variant which combines these two variants: 
rental caching with zapping. In our model, there is no rental cost for files in the additional 
cache. Only the files in the size k primary cache have to pay the rental cost. 

1.2 Previous work 



In 1985, Sleator and Tarjan [17] introduced the competitive-analysis framework. In [17] they show
that well-known paging rules such as LeastRecentlyUsed (LRU), FirstInFirstOut (FIFO),
and FlushWhenFull (FWF) are $k$-competitive and that $k$ is the best ratio any deterministic
online algorithm can achieve for the paging problem.

Fiat et al. [10] initiate the competitive analysis of paging algorithms in the randomized setting.
They show a lower bound of $H_k$, where $H_k$ is the $k$th harmonic number, for any randomized
algorithm. They give a $2H_k$-competitive RandomizedMarking algorithm. Achlioptas et al. f\\\
show that the tight competitive ratio of RandomizedMarking is $2H_k - 1$. McGeoch and Sleator
[l^ and Achlioptas et al. [H give optimal $H_k$-competitive randomized algorithms for paging.

For weighted caching, Chrobak et al. give a tight $k$-competitive deterministic algorithm. For
the randomized case, Bansal et al. [3] give a tight $O(\log k)$-competitive primal-dual algorithm.

For file caching, Irani shows that the offline problem is NP-hard. For the online case, Irani
[11] gives results for the bit model ($\mathrm{cost}(f) = \mathrm{size}(f)$ for each file $f$) and the fault model ($\mathrm{cost}(f) = 1$ for
each file $f$). She shows that LRU is $(k+1)$-competitive for both models. Cao and Irani [6] extend
the result to file caching. Young [l^ independently gives the Landlord algorithm and shows that it
is $k$-competitive for the file caching problem. Irani gives an $O(\log^2 k)$-competitive randomized
algorithm for the bit and fault models. Bansal et al. [3] give an $O(\log k)$-competitive randomized
algorithm for both models, and an $O(\log^2 k)$-competitive randomized algorithm for the general
case.



Young [18] uses online primal-dual analysis to give a $k$-competitive deterministic online
algorithm for weighted caching. Bansal et al. 0,0] and Buchbinder and Naor jH] use an online primal-dual
approach to give randomized algorithms for the paging, weighted caching, and file caching
problems. In a recent work, Adamaszek et al. [2] build on their online primal-dual approach to give an
$O(\log k)$-competitive algorithm for the general case. In another recent work, Epstein et al. [9] show that this
online primal-dual approach can be extended to Caching with Rejection. Caching with rejection is
a variant of file caching where a request to a file that is not in the cache can be declined by paying
a rejection penalty. In this variant, each request is specified as a pair $(f, r)$, where $f$ is the file
requested and $r$ is the rejection penalty. Note that caching with rejection is different from caching
with zapping. In caching with zapping, a file can be zapped at any time step, while in caching with
rejection, a file can be rejected only at the time step when it is requested. Moreover, a rejected
file can incur retrieval cost or rejection penalty again in the future, while a zapped file does not
incur any cost after it is zapped.



Koufogiannakis and Young [IJ] present a deterministic greedy $\Delta$-approximation algorithm for
any covering problem with a submodular and non-decreasing objective function, and with arbitrary
constraints that are closed upwards, such that each constraint has at most $\Delta$ variables. They show
that their algorithm is $\Delta$-competitive for the online version of the problem where the constraints
are revealed one at a time. Many online caching and paging problems reduce to online covering,
and consequently, their algorithm generalizes many classical deterministic algorithms for these
problems. These include LRU and FWF for paging, Balance and Greedy Dual for weighted
caching, Landlord (a.k.a. Greedy Dual Size) for file caching, and algorithms for Connection
Caching [IJ]. We study this approach and extend it to give deterministic online algorithms for
the variants of online file caching studied in this paper.



1.3 Our contributions 



We study rental caching, caching with zapping, and rental caching with zapping. We present
deterministic and randomized lower and upper bounds for these new variants of paging, weighted
caching, and caching in the online setting. We use the approach in [IJ] to give deterministic
algorithms for these online problems. While this approach is general, it does not necessarily give
optimal online algorithms. The direct application of this approach yields sub-optimal algorithms
in some of the cases we study in this paper. We describe these scenarios and also the appropriate
modifications to the algorithm to achieve better competitive ratios.
Table 1 presents the summary of the results in this paper.

For rental paging and for fault model, the deterministic upper and lower bounds in this paper 
are tight within constant factors. For the randomized case, the lower and upper bounds are tight 
within constant factors when A is O(p^) and when A > p For weighted rental paging and for 
rental caching, the upper and lower bounds are tight within constant factors when A < for the 
Table 1: Competitive ratios in this paper
deterministic case, and when A is O(p^) or when A > ^ for the randomized case. The bounds
for the variants with rental cost are within constant factors of the bounds for the variants without 
rental cost when X < for the deterministic case and when A is O(p^) for the randomized case. 

For higher values of A > we show constant lower bounds and matching upper bounds. 

For paging with zapping, weighted paging with zapping, and caching with zapping, the deter- 
ministic lower and upper bounds in this paper are tight within constant factors. 

1.4 Other work on rental paging 

Lopez-Ortiz and Salinger [15], in an independent work, study the rental paging problem. They give
a deterministic polynomial-time algorithm for the offline problem by reducing it to weighted
interval scheduling. They show that any conservative or marking algorithm is $k$-competitive and
that the bound is tight. An algorithm is conservative if it incurs at most $k$ faults on any consecutive
subsequence of requests that contains at most $k$ distinct pages. A marking algorithm marks each
page when it is requested, and when it is required to evict a page, it evicts an unmarked page. If
there are no unmarked pages, it first unmarks all the pages and then removes one.
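The marking rule described above can be sketched in a few lines. This is only an illustration for classical paging (no rental cost), not code from the cited work. The function name `marking_faults` is an assumption, and so is the tie-break of evicting the smallest unmarked page, since the definition allows any unmarked page to be evicted.

```python
def marking_faults(requests, k):
    """Count the faults of a deterministic marking algorithm with cache size k."""
    cache, marked = set(), set()
    faults = 0
    for page in requests:
        if page not in cache:
            faults += 1
            if len(cache) == k:
                unmarked = cache - marked
                if not unmarked:
                    # Every cached page is marked: unmark all pages first.
                    marked.clear()
                    unmarked = set(cache)
                # Tie-break (assumption): evict the smallest unmarked page.
                cache.remove(min(unmarked))
            cache.add(page)
        marked.add(page)
    return faults
```

For example, with `k = 2` the cyclic sequence `[1, 2, 3, 1, 2, 3]` incurs 5 faults under this tie-break.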

For any online algorithm $A$ for paging, define the algorithm $A_d$ for rental paging as follows. $A_d$
behaves like $A$ with the modification that any page in the cache that has not been requested for
$d$ steps is evicted. They define a class of online algorithms $M_d$, where $M$ is any conservative or
marking algorithm. They show an upper bound of 2 on the competitive ratio of Mi when A > -e, 

which matches the upper bound in this paper. They show an upper bound of max (k, j^p^^^^rjy) on 

the competitive ratio of Mi when A < This upper bound is weaker than the upper bound we 

present in this paper when ^ < A < p 

Their deterministic lower bound on the competitive ratio for rental paging matches the lower 
bound in this paper. 

They also present experimental results for the performance of LRU, $LRU_{1/\lambda}$, FWF,
$FWF_{1/\lambda}$, FIFO, $FIFO_{1/\lambda}$, and the optimal offline algorithm. The experimental results agree with the
upper bounds shown in the paper.

They present results only for rental paging and not for weighted rental paging or rental caching. 
They do not study the rental paging problem in the randomized setting. 



2 Online covering approach 



In this section, we give a brief overview of the online covering approach from [ij]. We use this
approach, with modifications in some cases, to give deterministic algorithms for the variants of
paging and caching problems in this paper. The idea is to reduce the given problem to online
covering and then use the online covering algorithm from [ij] as follows. In online covering, the
constraints are revealed one at a time in any order. Whenever the algorithm gets a constraint that
is not yet satisfied, it raises each variable in the constraint, at a rate inversely proportional to
the coefficient of the variable in the cost function, until the constraint is satisfied. This algorithm
is $\Delta$-competitive, where $\Delta$ is the maximum number of variables in any constraint.

Now we illustrate this approach for the case of paging. To formulate paging as a Covering Integer 
Linear Program (CILP), we define the following notation and continue using it in the remainder of 
the paper. 

• $f_t$ : file requested at time $t$

• $t'$ : time of the next request to the file requested at time $t$

• $x_t$ : indicator variable for the event that the file requested at time $t$ was evicted before $t'$

• $R(t)$ : set of times of the most recent request to each file until and including time $t$

• $Q(t)$ : $\{Q \subseteq R(t) \setminus \{t\} : |Q| = k\}$. That is, $Q(t)$ represents all possible ways that the cache
can be full when $f_t$ is requested at time $t$.

• $T$ : time of the last request

We formulate paging as follows (LP-Paging):

$$\min \sum_{t=1}^{T} x_t$$

$$\text{s.t.} \quad \forall t, \forall Q \in Q(t): \quad \sum_{s \in Q} \lfloor x_s \rfloor \ge 1$$

Each constraint represents the following. At time $t$, when $f_t$ is requested, for any set $Q$ in
$Q(t)$, it must be true that at least one of the files corresponding to the times in $Q$ is evicted
to make space for $f_t$. Clearly, any feasible solution to the paging problem is a feasible solution
to LP-Paging. In particular, any optimal solution to the paging problem is a feasible solution to
LP-Paging. For any variable $x$, $x^*$ denotes the value of $x$ in the optimal solution.

Now we describe the CILP based algorithm for paging. Note that, at each time step the 
algorithm may get multiple constraints. The algorithm considers the constraints in arbitrary order. 
When it gets a constraint that is not yet satisfied, it raises each variable in the constraint at
unit rate until the constraint is satisfied. Whenever a variable reaches 1, the algorithm evicts the 
corresponding file from the cache. If the algorithm gets a constraint that is already satisfied, the 
algorithm does not do anything. We say, the algorithm does work on a constraint, if it wasn't 
already satisfied and the algorithm raises the variables in the constraint, as described above, to 
satisfy it. 
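The covering-based paging algorithm described above can be sketched as follows. This is a simplified illustration, not the paper's implementation. It assumes that the only unsatisfied constraint on a fault with a full cache is the one for the current cache contents, and it uses exact fractions to avoid rounding issues. The name `covering_paging` and the returned pair (number of faults, total amount by which variables were raised) are assumptions made for the example.

```python
from fractions import Fraction


def covering_paging(requests, k):
    """Return (faults, fractional_cost) of the uniform-raising sketch."""
    x = {}  # cached file -> eviction variable of its latest request
    faults = 0
    fractional_cost = Fraction(0)
    for f in requests:
        if f not in x:
            faults += 1
            if len(x) == k:
                # Raise all k variables at unit rate until one of them reaches 1.
                delta = 1 - max(x.values())
                fractional_cost += delta * len(x)
                # Files whose variable reached 1 are evicted.
                x = {g: v + delta for g, v in x.items() if v + delta < 1}
        x[f] = Fraction(0)  # a fresh variable for the new request to f
    return faults, fractional_cost
```

Because every variable starts at 0 and all variables in the constraint are raised together, this particular sketch empties the cache whenever it is full, so it behaves like FlushWhenFull. Other members of the family (such as LRU) would need a different order of constraints or different raising choices, which the sketch does not model.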

Each constraint in LP-Paging has exactly $k$ variables. Now we show that this algorithm is
$k$-competitive using the following potential function.

$$\phi = \sum_{t=1}^{T} \max(x_t^* - x_t, 0)$$

Initially, $\phi = OPT$ and $ALG = 0$. When the algorithm gets a constraint that is not satisfied, it
raises each variable in the constraint at rate 1. So, the cost of the algorithm increases at rate
$k$. Also, $\phi$ decreases at least at unit rate because there is at least one variable $x_s$ in the constraint such
that $x_s < x_s^*$ (otherwise the constraint would already be satisfied). Thus, the algorithm maintains
the invariant $ALG/k + \phi \le OPT$. Since $\phi \ge 0$, $ALG \le k \cdot OPT$.

For the variants in this paper, we use the approach outlined above, but with modifications in 
some cases. When we use the algorithm without any modifications, we omit the proofs for the 
competitive ratio. For these cases, the competitive ratio is the maximum number of variables in 
any constraint on which the algorithm does some work. If we apply any modifications, we present 
complete proofs. 

3 Rental caching 

3.1 Deterministic algorithms using CILP 

In this section, we present a deterministic algorithm, RentalPagingCILP, for rental paging, and
then extend the algorithm to rental caching. Our algorithm is based on the greedy online covering
algorithm outlined in Section 2. We use the notation defined in Section 2. In addition, we define
the following indicator variable to account for renting files.

• $y_{t,s}$ : indicator variable for the event that the file requested at time $t$ pays the rental cost at
time $s$, where $t \le s < t'$

The following is the formulation for rental paging (LP-Rental-Paging):

$$\min \sum_{t=1}^{T} \Big( x_t + \lambda \sum_{t \le s < t'} y_{t,s} \Big)$$

$$\text{s.t.} \quad \forall t, \forall Q \in Q(t): \quad \sum_{s \in Q} \lfloor x_s \rfloor \ge 1 \qquad \text{(I)}$$

$$\forall t, \; t \le s < t': \quad \lfloor y_{t,s} \rfloor + \lfloor x_t \rfloor \ge 1 \qquad \text{(II)}$$

The first set of constraints (I) enforces the cache size at time $t$ (same as the constraints in
LP-Paging), and the second set of constraints (II) says that either a file has been evicted or it is being
rented at time $s$. We denote them by cache-size constraints and rent-evict constraints, respectively.

For each request, RentalPagingCILP gets some cache-size constraints and some rent-evict
constraints. It considers the rent-evict constraints before the cache-size constraints. Whenever it
gets a constraint that is not satisfied, it raises each variable in the constraint at a rate inversely
proportional to its cost in the objective, until the constraint is satisfied. So, the algorithm raises
$x_s$ at unit rate and $y_{t,s}$ at rate $1/\lambda$.

For some $\gamma > 0$, RentalPagingCILP$_\gamma$ is the algorithm that behaves like RentalPagingCILP
with the following modification. RentalPagingCILP$_\gamma$ raises $y_{t,s}$ at the modified rate of
$\gamma/\lambda$. Note that RentalPagingCILP$_1$ and RentalPagingCILP are the same algorithm.

Theorem 3.1. For rental paging, (a) RentalPagingCILP is 2-competitive when $\lambda \ge 1/k$, (b)
RentalPagingCILP$_{k\lambda}$ is $(1 + \frac{1}{k\lambda})$-competitive when $1/k^2 < \lambda < 1/k$, and (c) RentalPagingCILP is
$k$-competitive when $\lambda \le 1/k^2$.

Proof. (a) $1/k \le \lambda$: We claim that, at any given time, if all the rent-evict constraints are satisfied,
the cache-size constraints are satisfied too. We prove this by showing that each file is evicted within
$k$ steps from its latest request, by considering just the rent-evict constraints. At any given time,
the algorithm considers the rent-evict constraint corresponding to each file in the cache. In the
rent-evict constraint at time $t$, when $y_{t,s}$ goes from 0 to 1, $x_s$ increases by $\lambda$. So, if the file has
been in the cache for $t$ time steps since its latest request, $x_s = t\lambda$, which is at least 1 for $t \ge 1/\lambda$.
Since $1/\lambda \le k$, $x_s$ will be 1 in at most $k$ steps. Thus, the algorithm does work only on rent-evict
constraints, each of which has exactly 2 variables. So, RentalPagingCILP is 2-competitive.
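The counting argument in case (a) can be checked numerically with a small sketch. It assumes one rent-evict constraint per time step for a file that is not requested again, with the eviction variable rising by the rental cost at each step and capped at 1. The function name `steps_until_evicted` and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction


def steps_until_evicted(lam):
    """Number of rent-evict constraints processed before the eviction variable reaches 1."""
    x = Fraction(0)
    steps = 0
    while x < 1:
        # Processing one rent-evict constraint raises x by lam (capped at 1).
        x = min(Fraction(1), x + lam)
        steps += 1
    return steps
```

For a rental cost of 1/4 the file is evicted after 4 steps, and for 2/5 after 3 steps; whenever the rental cost is at least $1/k$, the count is at most $k$.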

(b) $1/k^2 < \lambda < 1/k$: When the algorithm considers a rent-evict constraint, it raises $x_s$ at unit
rate, but raises $y_{t,s}$ at rate $\gamma/\lambda$, where $\gamma = k\lambda$. The increment in $x_s$ is $\lambda/\gamma$ at each time step. So,
for $\gamma \le k\lambda$, within $k$ steps $x_s \ge 1$ and hence the corresponding file is evicted. Thus, as in the
previous case, the algorithm never does any work on the cache-size constraints. Now we show that
this algorithm is $(1 + \frac{1}{k\lambda})$-competitive. The proof is similar to the proof in Section 2. We use the
following potential function for our proof:

$$\phi = \sum_{t=1}^{T} \Big( \max(x_t^* - x_t, 0) + \lambda \sum_{t \le s < t'} \max(y_{t,s}^* - y_{t,s}, 0) \Big)$$

Consider the rent-evict constraint at time $s$ for the file whose most recent request was at time $t$.
When the algorithm raises the variables in the constraint, the cost of the algorithm increases at
rate $(1 + \gamma)$. Also, $\phi$ decreases at rate at least $\min(1, \gamma)$. Thus, the algorithm maintains the invariant
$\frac{\min(1, \gamma)}{1 + \gamma} ALG + \phi \le OPT$. It is true initially, because $ALG = 0$ and $\phi = OPT$. Since
$\phi \ge 0$, this implies that $ALG \le \frac{1 + \gamma}{\min(1, \gamma)} OPT$. Also, $\gamma = k\lambda < 1$. So, $ALG \le (1 + \frac{1}{k\lambda}) OPT$.

(c) $\lambda \le 1/k^2$: In this case, RentalPagingCILP does work on both cache-size constraints and
rent-evict constraints, and thus, the algorithm is $k$-competitive. □

Now we extend the results to rental caching and present the algorithm RentalCachingCILP.
For rental caching, the linear program is similar to the linear program for rental paging, with
appropriate changes to take into account the cost and the size of each file. We define $Q(t)$ to take
into account the file sizes as follows. $Q(t) = \{Q \subseteq R(t) \setminus \{t\} : k - \mathrm{size}(f_t) < \mathrm{size}(Q) \le k\}$, where
$\mathrm{size}(Q) = \sum_{s \in Q} \mathrm{size}(f_s)$. We modify the objective to take into account the cost of files. The
following is the formulation for rental caching (LP-Rental-Caching):

$$\min \sum_{t=1}^{T} \Big( \mathrm{cost}(f_t) \cdot x_t + \lambda \sum_{t \le s < t'} y_{t,s} \Big)$$

$$\text{s.t.} \quad \forall t, \forall Q \in Q(t): \quad \sum_{s \in Q} \mathrm{size}(f_s) \cdot \lfloor x_s \rfloor \ge \mathrm{size}(f_t)$$

$$\forall t, \; t \le s < t': \quad \lfloor y_{t,s} \rfloor + \lfloor x_t \rfloor \ge 1$$

When RentalCaghingCILP gets a rent-evict constraint that is not yet satisfied, it raises xt 
at rate ^^^^^^^^^ and yt^s at rate j. When it gets a cache-size constraint, it raises Xg at rate ^^J^j ^ ■ 

Theorem 3.2. RentalCachingCILP is $k$-competitive for rental caching.

Proof. Since each file has size at least 1, each constraint has at most $k$ variables. So, for the general
case of rental caching, the algorithm is $k$-competitive. □

Corollary 3.1. For rental caching for the case of fault model, (a) RentalCachingCILP is 2- 
competitive when A > ^, (b) RENTALCACHiNGCILPfciamMa is (1 -|- -^)- competitive when p- < A < 
p-, and (c) RentalCachingCILP is k-competitive when A < -p-. 

Proof. For the fault model, $\mathrm{cost}(f)$ is 1 for each file $f$. So, the cost function and the rent-evict
constraints are the same as in the case of rental paging. Thus, the three cases of Theorem
3.1 still hold. □



3.2 Rental caching with infinite cache 

Consider the special case of the rental paging (or caching) problem where the cache has infinite 
size. This is equivalent to the rental caching problem without any cache size constraint. Even 
though there is no cache size constraint, this problem is still interesting because there is a rental 
cost for keeping files in the cache. 

Theorem 3.3. If there is an $\alpha$-competitive algorithm ALG$_{SR}$ for ski-rental, then there is an
$\alpha$-competitive algorithm for rental caching with infinite cache.

Proof. Consider any file $f$. We define a phase as follows. A phase starts with a request to $f$ and
ends at the time step just before the next request to $f$. When $f$ is requested, it is either already
in the cache or it is retrieved and added to the cache. Thus, once a phase starts, the file must be
present in the cache and the earliest this file can be evicted from the cache is at the next time step.

Such a phase, excluding the first step, reduces to ski-rental as follows. The cost of renting is $\lambda$
and the cost of buying is the cost of eviction, which is $\mathrm{cost}(f)$. The algorithm does not know when the
phase ends, and at each time step it has to decide whether to keep renting the file or to pay
the eviction cost to buy it.

The algorithm ALG$_\infty$ for rental caching with infinite cache does the following. For a request to
file $f_t$ at time $t$, ALG$_\infty$ brings the file into the cache. Starting at the next time step, it simulates
ALG$_{SR}$ on $f_t$ to decide for how long it keeps the file in the current phase. If ALG$_{SR}$ buys $f_t$ at any
time step during the phase, ALG$_\infty$ evicts it from the cache at that step. The total rental cost of
ALG$_\infty$ is the same as the total rental cost of ALG$_{SR}$, and the total eviction cost of ALG$_\infty$ is equal to
the total cost of buying for ALG$_{SR}$.

Let $OPT_{SR}$ be the optimal cost of the ski-rental problem. In a phase, the cost of ALG$_\infty$ is at most
$\lambda + \alpha \cdot OPT_{SR}$ and the optimal cost is $OPT_\infty = \lambda + OPT_{SR}$. Thus the competitive ratio of this
algorithm is at most $\frac{\lambda + \alpha \cdot OPT_{SR}}{\lambda + OPT_{SR}}$. Since $\alpha \ge 1$, the competitive ratio is at most $\alpha$.

□ 
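The reduction in the proof above can be illustrated with a short simulation. This sketch plugs the deterministic break-even rule in as the ski-rental algorithm, which is one possible choice and not necessarily the one intended by the authors. It charges the retrieval cost of a file at eviction time, as the proof does. The function name `rental_infinite_cache_cost`, the convention that `None` denotes a time step without a request, and the parameter names are assumptions made for the example.

```python
def rental_infinite_cache_cost(requests, cost, lam):
    """Total cost (rent plus eviction) with an unbounded cache and break-even evictions."""
    rent_paid = {}  # cached file -> rent paid in its current phase, first step excluded
    total = 0
    for requested in requests:
        for f in list(rent_paid):
            if f == requested:
                continue  # the current phase of f ends; a new one starts below
            if rent_paid[f] + lam > cost[f]:
                total += cost[f]  # "buy": pay the eviction cost and drop the file
                del rent_paid[f]
            else:
                rent_paid[f] += lam  # "rent" the file for one more step
                total += lam
        if requested is not None:
            rent_paid[requested] = 0
            total += lam  # rent for the first step of the phase
    return total
```

For a single file of cost 2 and rental cost 1, the sequence `['a', None, None, None]` costs 5: one unit of rent at the request, two more units of rent, and then the eviction cost of 2. An offline solution would pay 3 on the same sequence.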

Corollary 3.2. ALG$_\infty$ is a 2-competitive deterministic algorithm for rental caching with infinite
cache.

Proof. The 2-competitive deterministic algorithm for ski-rental [1^ and Theorem 3.3 together imply
that ALG$_\infty$ is 2-competitive. □

Corollary 3.3. There is an $\frac{e}{e-1}$-competitive randomized algorithm for rental caching with infinite
cache.

Proof. The $\frac{e}{e-1}$-competitive randomized algorithm for ski-rental [12] and Theorem 3.3 together
imply that ALG$_\infty$ is $\frac{e}{e-1}$-competitive. □

Theorem 3.4. If there is an $\alpha$-competitive algorithm ALG$_{SR}$ for ski-rental, then there is an
$\alpha$-competitive algorithm for rental paging when $\lambda \ge 1/k$.

Corollary 3.4. When $\lambda \ge 1/k$, there is a 2-competitive deterministic algorithm for rental caching.

Corollary 3.5. When $\lambda \ge 1/k$, there is an $\frac{e}{e-1}$-competitive randomized algorithm for rental caching.

3.3 RentalCachingMeta 

Theorem 3.5. If there is an $\alpha$-competitive algorithm ALG$_{SR}$ for ski-rental, and a $\beta$-competitive
algorithm ALG$_C$ for caching (no rental cost), then there is an $(\alpha + \beta)$-competitive algorithm for rental
caching.

We present the RentalCachingMeta algorithm. Our algorithm uses ALG$_{SR}$ and ALG$_C$ to
generate a solution for rental caching. On an input sequence $\sigma$ and cache size $k$, RentalCachingMeta
does the following. It simulates ALG$_C$ on the input sequence $\sigma$ and cache $C_1$ of size $k$. In
parallel, it simulates ALG$_\infty$ on the request sequence $\sigma$ and cache $C_2$ of infinite size. ALG$_\infty$ in turn
simulates ALG$_{SR}$ on each request. At any time, the cache of RentalCachingMeta contains the
intersection of the files present in caches $C_1$ and $C_2$.
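The combination step can be sketched as follows. The example assumes that the two simulated algorithms have already produced their cache contents for every time step (as lists of sets), and it only shows how the combined cache is formed and why its total size cannot exceed that of the bounded cache. The names `meta_cache_states` and `total_size` are hypothetical helpers introduced for this illustration.

```python
def meta_cache_states(c1_states, c2_states):
    """Per-step cache contents of the combined algorithm: the intersection of C1 and C2."""
    return [c1 & c2 for c1, c2 in zip(c1_states, c2_states)]


def total_size(files, size):
    """Total size of a set of files, given a mapping from file to size."""
    return sum(size[f] for f in files)
```

Since each combined cache is a subset of the corresponding contents of the size-$k$ cache, `total_size` of the combined cache is never larger than `total_size` of that cache at the same step.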

Claim 3.1. The total size of the items in the cache of RentalCachingMeta never exceeds $k$.

Proof. The total size of all items in the cache of ALG$_C$ is at least the total size of all items in the cache
of RentalCachingMeta. This proves our claim, because ALG$_C$ maintains the invariant that the
total size of items in the cache is at most $k$. □

Claim 3.2. $E[\text{RentalCachingMeta}] \le E[ALG_{SR}] + E[ALG_C]$

Proof. RentalCachingMeta evicts a file when at least one of ALG$_{SR}$ and ALG$_C$ evicts the
file. For each eviction, we charge the cost of eviction for RentalCachingMeta to the algorithm
that evicted the file, breaking ties arbitrarily. We charge the rental cost of RentalCachingMeta
to the rental cost of ALG$_{SR}$. This proves our claim. □

Also, $E[ALG_{SR}] \le \alpha \cdot OPT_{SR} \le \alpha \cdot OPT$, and $E[ALG_C] \le \beta \cdot OPT_C \le \beta \cdot OPT$, where $OPT_{SR}$
denotes the optimal cost for rental caching with infinite cache, $OPT_C$ denotes the optimal cost for
caching, and $OPT$ denotes the optimal cost for rental caching. So, $E[\text{RentalCachingMeta}] \le (\alpha + \beta) OPT$,
and hence, RentalCachingMeta is an $(\alpha + \beta)$-competitive algorithm for rental caching.

If both ALG$_C$ and ALG$_{SR}$ are deterministic, RentalCachingMeta is also deterministic;
otherwise it is randomized. Theorem 3.5 implies the following two corollaries.

Corollary 3.6. The 2-competitive deterministic online algorithm for ski-rental f^] and the
$k$-competitive deterministic online algorithm for caching fljl J. give a $(k + 2)$-competitive deterministic
online algorithm for rental caching.

Corollary 3.7. The $\frac{e}{e-1}$-competitive randomized online algorithm for ski-rental [12] and the
$H_k$-competitive randomized online algorithm for caching give an $(H_k + \frac{e}{e-1})$-competitive
randomized online algorithm for rental caching.



3.4 Lower bounds 

Theorem 3.6. The competitive ratio of any deterministic algorithm for rental paging is at least
(a) 2 when $\lambda \ge 1/k$ and (b) $\frac{k(1+\lambda)}{1+k^2\lambda}$ when $\lambda < 1/k$.

Proof. (a) $\lambda \ge 1/k$: Corollary 3.4 implies a deterministic lower bound of 2.

(b) $\lambda < 1/k$: The adversary requests files from the set $\{1, 2, 3, \ldots, k+1\}$. At each step, the
adversary requests a file that is not present in the cache of the algorithm. The algorithm faults at
each time step and pays at least $\lambda$ in rent at each step. OPT pays the rental cost to keep $k$ items in the
cache at each time step and faults once in $k$ steps. So, over $k$ steps the algorithm pays at least
$k(1+\lambda)$ while OPT pays at most $1 + k^2\lambda$, and the ratio is at least $\frac{k(1+\lambda)}{1+k^2\lambda}$. For sufficiently
small $\lambda$, the ratio tends to $k$. □

Theorem 3.7. The competitive ratio of any randomized algorithm for rental paging is at least (a)
$\frac{e}{e-1}$ when $\lambda \ge 1/k$ and (b) $\frac{H_k(1+k^2\lambda)}{1+k^2\lambda H_k}$ when $\lambda < 1/k$.

Proof. (a) $\lambda \ge 1/k$: Corollary 3.5 implies a randomized lower bound of $\frac{e}{e-1}$.

(b) $\lambda < 1/k$: The adversary requests files from a set of $k+1$ files. At each step, the adversary requests
a file with uniform probability over all files except the file requested at the previous step. We split
the request sequence into phases as follows. A phase is the longest request sequence with at most
$k$ distinct requests, and starts immediately after the previous phase ends.

We now show that the expected length of each phase is $kH_k$. When $i$ files have been requested,
the probability of requesting a file that has not been requested in the phase is $\frac{k-i+1}{k}$. Thus the
expected length of a phase is $\sum_{i=1}^{k} \frac{k}{k-i+1} = kH_k$.

Next, we show that the algorithm keeps its cache full to minimize the expected cost in a phase.
Assume that $i$ distinct files have been requested in the phase and the algorithm has $p \le k$ files in
the cache. At the next time step, the algorithm faults with probability $\frac{k-p+1}{k}$ and pays a rental
cost of $p\lambda$. So, the expected cost of the algorithm is $1 + \frac{1}{k} - p(\frac{1}{k} - \lambda)$. Since $\frac{1}{k} > \lambda$, the cost is
minimized when $p = k$. So, it pays $\frac{1}{k}$ expected eviction cost and $k\lambda$ rental cost at each step. OPT pays the
same rental cost, but faults once in each phase. So, for each phase, the ratio is at least $\frac{H_k(1+k^2\lambda)}{1+k^2\lambda H_k}$.
For sufficiently small $\lambda$, the ratio tends to $H_k$. □
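The phase-length computation in the proof can be verified with exact arithmetic. The sketch below sums the expected waiting times for each new distinct file and compares the result with $k$ times the $k$th harmonic number. The function names `harmonic` and `expected_phase_length` are assumptions made for this illustration.

```python
from fractions import Fraction


def harmonic(k):
    """The k-th harmonic number as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, k + 1))


def expected_phase_length(k):
    """Sum of the expected waiting times k / (k - i + 1) for i = 1, ..., k."""
    return sum(Fraction(k, k - i + 1) for i in range(1, k + 1))
```

For every positive integer `k`, `expected_phase_length(k)` equals `k * harmonic(k)`, because the summands are the terms $k/j$ for $j = 1, \ldots, k$ in reverse order.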

4 Caching with zapping 

4.1 Deterministic algorithms using CILP 

In this section, we present a deterministic algorithm, ZappingPagingCILP, for paging with zapping,
and then extend the algorithm to caching with zapping. We introduce another indicator
variable for zapping of files.

• $z_f$ : indicator variable for the event that the file $f$ has been zapped

We formulate paging with zapping as follows (LP-Paging-Zapping):

$$\min \sum_{t=1}^{T} x_t + N \sum_{f \in F} z_f$$

$$\text{s.t.} \quad \forall t, \forall Q \in Q(t): \quad \sum_{s \in Q} \big( \lfloor x_s \rfloor + \lfloor z_{f_s} \rfloor \big) + \lfloor z_{f_t} \rfloor \ge 1$$

The constraints say that either $f_t$ is zapped, or at least one file in the cache is either zapped or
evicted. Whenever ZappingPagingCILP gets a constraint that is not satisfied, it raises $x_s$ at unit
rate and $z_f$ at rate $1/N$.

Theorem 4.1. ZappingPagingCILP is $(2k+1)$-competitive.

Proof. Each constraint has $2k+1$ variables. Thus, ZappingPagingCILP is $(2k+1)$-competitive. □

Now we extend this approach to caching with zapping and present the algorithm ZappingCachingCILP.
We define $Q(t)$ the same way as for rental caching in Section 3.1. We have the
following linear program (LP-Caching-Zapping) for caching with zapping:

$$\min \sum_{t=1}^{T} \mathrm{cost}(f_t) \cdot x_t + N \sum_{f \in F} z_f$$

$$\text{s.t.} \quad \forall t, \forall Q \in Q(t): \quad \sum_{s \in Q} \min\big( \lfloor x_s \rfloor + \lfloor z_{f_s} \rfloor, 1 \big) \cdot \mathrm{size}(f_s) + \lfloor z_{f_t} \rfloor \cdot \mathrm{size}(f_t) \ge \mathrm{size}(f_t)$$

For each not yet satisfied constraint that ZappingCachingCILP gets, it raises Xg at rate 
and Zf^ at rate i. 
Theorem 4.2. ZappingCachingCILP is $(2k+1)$-competitive.

Proof. Since each file has size at least 1, each constraint has at most $2k+1$ variables. Thus,
ZappingCachingCILP is $(2k+1)$-competitive. □

Theorem 4.3. The algorithm that zaps each file when it is requested for the first time is
$N$-competitive.

Proof. Let the total number of distinct files requested be $T$. The total cost of OPT is at least $T$ (1
to bring each file into the cache). The total cost of the algorithm is at most $NT$. So, the ratio is at
most $N$. □
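The counting argument in this proof can be checked on a small instance. The sketch below is only an illustration: the function name `zap_on_first_request_cost` and the request sequence are made up, and the zapping cost is assumed to be at least 1.

```python
def zap_on_first_request_cost(requests, zap_cost):
    """Total cost of zapping every file the first time it is requested."""
    zapped = set()
    total = 0
    for f in requests:
        if f not in zapped:  # a zapped file never incurs any cost again
            zapped.add(f)
            total += zap_cost
    return total


requests = ["a", "b", "a", "c", "b", "a"]  # hypothetical request sequence
N = 4                                       # hypothetical zapping cost
alg = zap_on_first_request_cost(requests, N)
distinct = len(set(requests))               # lower bound on OPT used in the proof
```

Here the algorithm pays $N$ times the number of distinct files, while the proof charges OPT at least 1 per distinct file, so the ratio on this instance is at most $N$.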



4.2 Lower Bounds 

Theorem 4.4. For paging with zapping, the competitive ratio of any deterministic algorithm is at
least $\frac{2Nk + N - (k+1)}{N + 2k}$.
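As a numeric sanity check of this bound, the following sketch evaluates the per-round ratio that the proof derives and compares it with the closed form in the theorem. It is an illustration under stated assumptions: the cost expressions for the algorithm and for the offline algorithm are our reading of the proof (with $T$ the number of files zapped in a round and `total_H` the sum of the zap-phase lengths), and the function names are hypothetical.

```python
from fractions import Fraction


def round_ratio(N, k, T, total_H):
    """Lower bound on ALG / cost(F) for one round of the adversary.

    N: zapping cost, k: cache size, T: files zapped in the round,
    total_H: sum of the zap-phase lengths H_1 + ... + H_T.
    """
    alg = (N - 1) * T + total_H                # one fault per step, N per zap
    no_zap = k + T - 1 + Fraction(total_H, k)  # offline algorithm never zaps
    one_zap = k + T + N - 1                    # offline algorithm zaps one file
    return Fraction(alg) / min(no_zap, one_zap)


def theorem_bound(N, k):
    """Closed form stated in Theorem 4.4."""
    return Fraction(2 * N * k + N - (k + 1), N + 2 * k)


# At the balance point total_H = N * k with T = k + 1, the two expressions agree.
N, k = 5, 3
ratio = round_ratio(N, k, T=k + 1, total_H=N * k)
bound = theorem_bound(N, k)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.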

Proof. The adversary maintains a set of $k+1$ distinct files at all times. Every time a file is zapped
by the algorithm, it is replaced in the set by a new file that has never been requested by the
adversary. At each time step, the adversary requests a file that is not present in the cache of the
algorithm. We define a zap-phase as follows. The first zap-phase starts with the first request of the
input sequence. A zap-phase ends every time a file is zapped, and the following request marks the
beginning of the next zap-phase. We define a round as follows. The first round starts with the
first request of the input sequence. A round ends when the algorithm has zapped all of the
$k+1$ files that were requested in the first $k+1$ time steps in that round (some other files may have
been zapped too). The total number of files zapped in a round is at least $k+1$. The adversary
repeats the process for a large number of rounds.

Now we show the lower bound on the competitive ratio in each round. Consider any round.
Let $T \ge k+1$ be the total number of files zapped by the algorithm in the round. Let $H_j$ be the
length of zap-phase $j$, $1 \le j \le T$.

Any deterministic algorithm faults at each time step and zaps a total of $T$ files. So, the cost of
any deterministic algorithm is at least $ALG = NT + \sum_{j=1}^{T} (H_j - 1) = (N-1)T + \sum_{j=1}^{T} H_j$. Note
that $\sum_{j=1}^{T} H_j \ge T$.

When $\sum_{j=1}^{T} H_j = T$, the algorithm zaps each file when it is requested for the first time. In this
case, the adversary requests a new file at each step. OPT pays at most $\min(1, N)$ at each step while
the algorithm pays $N$ at each step. For $N > 1$, the ratio is at least $N$.

Now we assume that $\sum_{j=1}^{T} H_j > T$. Consider the offline algorithm $\mathcal{F}$ which, on any request
sequence, does one of the following: (a) does not zap any file, or (b) chooses one file from the set
of the first $k+1$ files requested and zaps it at the first step.

If $\mathcal{F}$ doesn't zap any files, in the first zap-phase it pays $k$ to bring the first $k$ files into the cache
and then pays at most $\lceil (H_1 - k)/k \rceil$ in the remainder of the first phase. For any zap-phase $j > 1$, $\mathcal{F}$
pays at most $\lceil H_j/k \rceil$. The total cost in this case is at most

$$k + \left\lceil \frac{H_1 - k}{k} \right\rceil + \sum_{j=2}^{T} \left\lceil \frac{H_j}{k} \right\rceil \le k + T - 1 + \sum_{j=1}^{T} \frac{H_j}{k}.$$

If $\mathcal{F}$ zaps 1 file, it incurs $k$ faults in the first phase and 1 fault in each phase after that. It also
pays $N$ for zapping 1 file. The total cost in this case is $k + (T - 1) + N$.
Since $cost(\mathcal{F}) \ge OPT$, the competitive ratio is at least

$$\frac{(N-1)T + \sum_{j=1}^{T} H_j}{\min\left( k + T - 1 + \sum_{j=1}^{T} \frac{H_j}{k},\ k + T + N - 1 \right)}.$$

The ratio is minimized when $k + T - 1 + \sum_{j=1}^{T} \frac{H_j}{k} = k + T + N - 1$. Simplifying gives
$\sum_{j=1}^{T} H_j = Nk$.

For any given $N$ and $k$, and for $T \ge k+1$, the ratio is minimized when $T = k+1$. So, the
competitive ratio is at least

$$\frac{(N-1)(k+1) + Nk}{N + 2k} = \frac{2Nk + N - (k+1)}{N + 2k}.$$

5 Rental paging with zapping

5.1 Deterministic algorithm using CILP

In this section we present algorithm RentalZappingPagingCILP for rental paging with zapping.
We use the same notation as in the previous sections. The cache-size constraints are
exactly the same as in the case of paging with zapping. The rent-evict constraints are modified to have
variables for eviction, renting, and zapping. We have the following formulation
(LP-Paging-Rental-Zapping):

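$$\min \sum_{t=1}^{T} \left( x_t + \lambda \sum_{t < s < t'} y_{t,s} \right) + N \sum_{f \in F} z_f$$

$$\text{s.t.}\quad \forall t,\ \forall Q \in \mathcal{Q}(t):\quad \sum_{s \in Q} \left( \lfloor x_s \rfloor + \lfloor z_{f_s} \rfloor \right) + \lfloor z_{f_t} \rfloor \ge 1 \qquad (III)$$

$$\forall t,\ t < s < t':\quad \lfloor y_{t,s} \rfloor + \lfloor x_t \rfloor + \lfloor z_{f_t} \rfloor \ge 1 \qquad (IV)$$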
We refer to (III) and (IV) as the cache-size constraints and the rent-evict-zap constraints, respectively.
RentalZappingPagingCILP is similar to RentalPagingCILP, and considers the rent-evict-zap
constraints before the cache-size constraints. Whenever the algorithm gets a constraint that is
not satisfied, it raises all the variables in the constraint as follows. It raises each eviction variable
$x$ at unit rate, $y_{t,s}$ at rate $1/\lambda$, and each zapping variable $z_f$ at rate $1/N$.

We define RentalZappingPagingCILP$_\gamma$ as RentalZappingPagingCILP with the following
modification. RentalZappingPagingCILP$_\gamma$ raises $y_{t,s}$ at the modified rate of $\gamma/\lambda$, where $\gamma > 0$.

Theorem 5.1. For rental paging with zapping, (a) RentalZappingPagingCILP is 3-competitive
when $\lambda \ge 1/k$, (b) RentalZappingPagingCILP$_{k\lambda}$ is $(1 + \frac{2}{k\lambda})$-competitive when $1/k^2 < \lambda < 1/k$, and
(c) RentalZappingPagingCILP is $(2k+1)$-competitive when $\lambda \le 1/k^2$.

Proof. (a) $\lambda \ge 1/k$: We claim that, at any given time, if all the rent-evict-zap constraints are satisfied,
the cache-size constraints are satisfied too. We prove this by showing that each file is evicted within
$k$ steps from its latest request, by considering just the rent-evict-zap constraints. At any given time,
the algorithm considers the rent-evict-zap constraint corresponding to each file in the cache.

Consider the rent-evict-zap constraint at time $s$ for the file whose latest request was at time $t$.
For this constraint, when $y_{t,s}$ goes from 0 to 1, $x_t$ increases by $\lambda$. So, once the file has been in the cache
for $s$ time steps since its latest request, $x_t = s\lambda$. This value is at least 1 if the file has been in the
cache for $s \ge 1/\lambda$ steps. Since $k \ge 1/\lambda$, $x_t$ is at least 1 when $s = k$, and consequently all the cache-size
constraints in which $x_t$ participates will be satisfied. Note that, if $z_{f_t}$ is 1 before $x_t$ is 1, the cache-size
constraints are still satisfied. Thus, the algorithm does work only on rent-evict-zap constraints,
each of which has exactly 3 variables. Thus, RentalZappingPagingCILP is 3-competitive.

(b) $1/k^2 < \lambda < 1/k$: When the algorithm considers a rent-evict-zap constraint, it raises $x_t$ at unit
rate, but raises $y_{t,s}$ at rate $\gamma/\lambda$, where $\gamma = k\lambda$. The increment in $x_t$ is $\lambda/\gamma$ at each time step. So,
for $\gamma \le k\lambda$, within $k$ steps $x_t \ge 1$ and hence the corresponding file is evicted. Thus, as in the
previous case, the algorithm never does any work on the cache-size constraints. Now we show that
this algorithm is $(1 + \frac{2}{k\lambda})$-competitive. We use the following potential function for our proof:

$$\phi = \sum_{t=1}^{T} \left( \max(x^*_t - x_t, 0) + \sum_{t < s < t'} \lambda \max(y^*_{t,s} - y_{t,s}, 0) \right) + \sum_{f \in F} N \max(z^*_f - z_f, 0)$$

Consider the rent-evict-zap constraint at time $s$ for the file whose most recent request was at
time $t$.

When the algorithm raises the variables in the constraint, the cost of the algorithm increases
at the rate $(2 + \gamma)$. Also, $\phi$ decreases at a rate of at least $\min(1, \gamma)$. Thus, the algorithm maintains the
invariant $ALG/(2 + \gamma) + \phi/\min(1, \gamma) \le OPT/\min(1, \gamma)$. It is true initially, because $ALG = 0$ and $\phi = OPT$.
Since $\phi \ge 0$, this implies that $ALG \le \frac{2 + \gamma}{\min(1, \gamma)} OPT$. Also, $\gamma = k\lambda < 1$. So, $ALG \le (1 + \frac{2}{k\lambda}) OPT$.

(c) $\lambda \le 1/k^2$: In this case, RentalZappingPagingCILP does work on both cache-size
constraints and rent-evict-zap constraints, and thus, the algorithm is $(2k+1)$-competitive. □
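The arithmetic in cases (a) and (b) can be replayed with a short sketch. It is an illustration under stated assumptions: the function names are hypothetical, $\gamma = 1$ is taken to stand for the unmodified algorithm, and the example values of $k$ and $\lambda$ are made up.

```python
from fractions import Fraction
from math import ceil


def steps_until_evicted(lam, gamma):
    """Time steps until x_t reaches 1 when y_{t,s} is raised at rate gamma/lam.

    A rent-evict-zap constraint is satisfied once y_{t,s} reaches 1, which
    takes lam/gamma units of raising time, so x_t grows by lam/gamma per step.
    """
    per_step = Fraction(lam) / Fraction(gamma)
    return ceil(1 / per_step)


def potential_ratio(gamma):
    """The bound (2 + gamma) / min(1, gamma) from the potential argument."""
    gamma = Fraction(gamma)
    return (2 + gamma) / min(Fraction(1), gamma)


# Case (b), assumed values: k = 4 and lam = 1/8, so gamma = k * lam = 1/2.
k, lam = 4, Fraction(1, 8)
gamma = k * lam
steps_b = steps_until_evicted(lam, gamma)   # file is evicted within k steps
ratio_b = potential_ratio(gamma)            # equals 1 + 2 / (k * lam)

# Case (a), assumed values: lam = 1/2 >= 1/k with the unmodified rate (gamma = 1).
steps_a = steps_until_evicted(Fraction(1, 2), 1)
ratio_a = potential_ratio(1)
```

With these values the file is evicted after exactly $k = 4$ steps in case (b) and the ratio evaluates to $1 + \frac{2}{k\lambda} = 5$, while case (a) gives the ratio 3.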

To extend the algorithm to rental caching with zapping, we combine the ideas from Sections 3.1
and 4.1. We modify $\mathcal{Q}(t)$ to account for file sizes:
$\mathcal{Q}(t) = \{Q \subseteq R(t) - \{t\} : k - size(f_t) < size(Q) \le k\}$.
The following is the formulation of rental caching with zapping (LP-Caching-Rental-Zapping):



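$$\min \sum_{t=1}^{T} \left( cost(f_t) \cdot x_t + \lambda \sum_{t < s < t'} y_{t,s} \right) + N \sum_{f \in F} z_f$$

$$\text{s.t.}\quad \forall t,\ \forall Q \in \mathcal{Q}(t):\quad \sum_{s \in Q} \min\left( \lfloor x_s \rfloor + \lfloor z_{f_s} \rfloor,\ 1 \right) \cdot size(f_s) + \lfloor z_{f_t} \rfloor \cdot size(f_t) \ge size(f_t)$$

$$\forall t,\ t < s < t':\quad \lfloor y_{t,s} \rfloor + \lfloor x_t \rfloor + \lfloor z_{f_t} \rfloor \ge 1 \qquad (IV)$$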

When RentalZappingCachingCILP gets a rent-evict-zap constraint that is not yet satisfied,
it raises $x_t$ at rate $1/cost(f_t)$, $y_{t,s}$ at rate $1/\lambda$, and $z_{f_t}$ at rate $1/N$. When it gets a cache-size constraint,
it raises each $x_s$ at rate $1/cost(f_s)$ and each $z_f$ at rate $1/N$.

Theorem 5.2. RentalZappingCachingCILP is $(2k+1)$-competitive for rental caching with
zapping.

Proof. Since each file has size at least 1, each cache-size constraint has at most $2k+1$ variables
and each rent-evict-zap constraint has exactly 3 variables. So, for the general case of rental caching with
zapping, the algorithm is $(2k+1)$-competitive. □

Corollary 5.1. For rental caching with zapping in the fault model, (a) RentalZappingCachingCILP
is 3-competitive when $\lambda \ge 1/k$, (b) RentalZappingCachingCILP$_{k\lambda}$ is $(1 + \frac{2}{k\lambda})$-competitive
when $1/k^2 < \lambda < 1/k$, and (c) RentalZappingCachingCILP is $(2k+1)$-competitive
when $\lambda \le 1/k^2$.

Proof. For the fault model, $cost(f)$ is 1 for each file $f$. So, the cost function and the rent-evict-zap
constraints are the same as in the case of rental paging with zapping. Thus, the three cases of Theorem
5.1 still hold. □



5.2 RentalZappingCachingMeta 

Analogous to Theorem 3.5, we have the following theorem for rental caching with zapping.

Theorem 5.3. If there is an $\alpha$-competitive algorithm $ALG_{SR}$ for ski-rental, and a $\beta$-competitive
algorithm $ALG_Z$ for caching with zapping (no rental cost), then there is an $(\alpha + \beta)$-competitive algorithm
for caching with zapping and rental cost.

We present the RentalZappingCachingMeta algorithm. Our algorithm uses $ALG_{SR}$ and
$ALG_Z$ to generate a solution for rental caching with zapping. On an input sequence $\sigma$ and cache size
$k$, RentalZappingCachingMeta does the following. It simulates $ALG_Z$ on the input sequence
$\sigma$ and cache $C_1$ of size $k$. In parallel, it simulates $ALG_\infty$ on the request sequence $\sigma$ and cache
$C_2$ of infinite size. $ALG_\infty$ in turn simulates $ALG_{SR}$ on each request. At any time, the cache of
RentalZappingCachingMeta contains the intersection of the files present in caches $C_1$ and $C_2$.
If $ALG_Z$ zaps a file, RentalZappingCachingMeta zaps it.

Claim 5.1. The total size of the items in the cache of RentalZappingCachingMeta never 
exceeds k. 
Proof. The total size of all items in the cache of $ALG_Z$ is at least the total size of all items in the cache
of RentalZappingCachingMeta. This proves our claim, because $ALG_Z$ maintains the invariant
that the total size of items in the cache is at most $k$. □

Claim 5.2. $E[\text{RentalZappingCachingMeta}] \le E[ALG_{SR}] + E[ALG_Z]$

Proof. RentalZappingCachingMeta evicts a file when at least one of $ALG_{SR}$ and $ALG_Z$ has evicted
the file. For each eviction, charge the cost of eviction for RentalZappingCachingMeta to the
algorithm that evicted the file, breaking ties arbitrarily. Charge the cost of zapping to $ALG_Z$ and
charge the rental cost to the rental cost of $ALG_{SR}$. This proves our claim. □

Also, $E[ALG_{SR}] \le \alpha \cdot OPT_{SR} \le \alpha \cdot OPT$, and $E[ALG_Z] \le \beta \cdot OPT_Z \le \beta \cdot OPT$, where $OPT_{SR}$
denotes the optimal cost for rental caching with infinite cache, $OPT_Z$ denotes the optimal cost
for caching with zapping, and $OPT$ denotes the optimal cost for rental caching with zapping. So,
$E[\text{RentalZappingCachingMeta}] \le (\alpha + \beta) OPT$, and hence, RentalZappingCachingMeta
is an $(\alpha + \beta)$-competitive algorithm for rental caching with zapping. If both $ALG_Z$ and $ALG_{SR}$ are deterministic,
RentalZappingCachingMeta is also deterministic; otherwise it is randomized.
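The intersection rule can be illustrated with a small simulation. This is a sketch under loud assumptions, not the algorithm analysed above: a size-aware LRU cache that never zaps stands in for $ALG_Z$, a break-even ski-rental rule (keep a file while the rent paid since its last request is below its retrieval cost) stands in for $ALG_{SR}$, every file is assumed to fit in the cache, and all names and numbers are hypothetical.

```python
from collections import OrderedDict
from fractions import Fraction


def simulate_meta(requests, size, cost, k, lam):
    """Return the meta cache (intersection of C1 and C2) after each request."""
    c1 = OrderedDict()   # stand-in for ALG_Z: LRU with file sizes, never zaps
    last_request = {}    # stand-in for ALG_inf: file -> time of latest request
    snapshots = []
    for t, f in enumerate(requests):
        # C1: move f to the most-recent position, evicting LRU files to fit it.
        c1.pop(f, None)
        while sum(size[g] for g in c1) + size[f] > k:
            c1.popitem(last=False)
        c1[f] = True
        # C2: unbounded cache; a file stays while rent paid is below its cost.
        last_request[f] = t
        c2 = {g for g, s in last_request.items() if lam * (t - s) < cost[g]}
        # The meta algorithm holds exactly the files present in both caches.
        snapshots.append(set(c1) & c2)
    return snapshots


size = {"a": 2, "b": 1, "c": 2, "d": 1}
cost = {"a": 1, "b": 1, "c": 1, "d": 1}
snapshots = simulate_meta(["a", "b", "c", "a", "d"], size, cost,
                          k=3, lam=Fraction(1, 2))
sizes = [sum(size[f] for f in snap) for snap in snapshots]
```

Because every snapshot is a subset of the bounded cache `c1`, its total size never exceeds `k`, which is the content of Claim 5.1.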

6 Conclusions and further directions 

We present lower and upper bounds, in deterministic and randomized settings, for rental paging and 
rental caching. For most cases, the lower and upper bounds are tight up to constant factors. When 
-£rjj- < < i> there is a gap between the randomized lower and upper bounds shown in this paper. 
The lower bounds in this paper assume that the cache of OPT is always full, and consequently, in
each phase OPT's rental cost is no longer $O$(OPT's eviction cost). It may be possible to show
better lower bounds using a modified analysis or possibly by using another adversary strategy.

The deterministic lower and upper bounds for paging with zapping are tight up to constant 
factors. The next step would be to study randomized lower bounds and randomized algorithms for 
caching with zapping. 

For rental caching with zapping, we present upper bounds in both deterministic and
randomized settings. It would be interesting to study the lower bounds.

The models in this paper assume uniform rental cost and uniform zapping cost.
Note that, in our model for rental caching, the total rental cost depends only on the size of a file. 
A natural extension would be to consider models with (arbitrary) non-uniform rental and zapping 
costs. 

The CILP based approach by Koufogiannakis and Young [3] is a general and elegant approach 
for deriving deterministic algorithms for online paging and caching problems. We use this approach 
for all the new variants studied in this paper. The algorithms thus derived may not be optimal, 
as we show for the case of rental paging (or caching) and also for rental paging (or caching) with 
zapping. For the problems in this paper, we were able to apply simple modifications to achieve 
upper bounds within constant factors on the lower bounds. 

The primal-dual approach is a powerful framework for deriving randomized
algorithms for online caching problems. It would be interesting to investigate whether this approach can
be used to give randomized algorithms for the variants studied in this paper.